AF: Small: Collaborative Research: On-Line Learning Algorithms for Path Experts with Non-Additive Losses
Abstract
On-line learning algorithms are increasingly adopted as the key solution to modern learning applications with very large data sets of several hundred million or billion points. These algorithms process one sample at a time, with an update per iteration that is often computationally cheap and easy to implement. As a result, they are substantially more efficient in both time and space than standard batch learning algorithms, whose optimization solutions are often costly and whose time and space requirements are prohibitive for large data sets. Furthermore, on-line learning algorithms benefit from a rich theoretical analysis, with learning bounds that are often very tight [Cesa-Bianchi and Lugosi, 2006].

A standard learning paradigm in on-line learning is the expert setting. In this setting, the algorithm maintains a distribution over a set of experts (or picks an expert from an implicitly maintained distribution). At each round, the loss assigned to each expert is revealed. The algorithm incurs the expected loss over the experts and then updates its distribution on the set of experts. The objective of the learner is to minimize its expected regret, defined as the cumulative loss of the algorithm minus the cumulative loss of the best expert chosen in hindsight.

There are several natural algorithms for this setting. One straightforward algorithm is the so-called follow-the-leader (FL) algorithm, which consists of selecting an expert that minimizes the current cumulative loss. However, this algorithm can be shown not to admit a favorable worst-case regret. An alternative algorithm with good regret guarantees is the Randomized Weighted Majority (RWM) algorithm [Littlestone and Warmuth, 1994] (or the Hedge algorithm [Freund and Schapire, 1997]). This algorithm maintains one weight per expert i, proportional to exp(−ηL_i), where L_i is the current total loss of expert i and η a positive learning rate. Thus, in the RWM algorithm, the minimum of the FL algorithm is replaced by a "softmin". An alternative algorithm with similar guarantees is the follow-the-perturbed-leader (FPL) algorithm [Kalai and Vempala, 2005], which first perturbs the total losses of the experts with properly scaled additive noise and then picks the expert of minimum total perturbed loss. Both of these algorithms have a parameter (the learning rate or the scale factor of the additive noise), which must be tuned in hindsight for the algorithm to achieve the optimal regret bound. More recently, we discovered an alternative perturbation, which consists of dropping to zero the loss of each expert with probability one half. When FL is applied to the total "dropout" losses, the resulting algorithm achieves the optimal regret without tuning any parameter [van Erven et al., 2014].

Most learning problems arising in applications such as machine translation, automatic speech recognition, optical character recognition, or computer vision admit some structure. In these problems, the experts can be viewed as paths in a directed graph, with each edge corresponding to a sub-structure such as a word, phoneme, character, or image patch. This motivates our study of on-line learning with path experts. Note that the number of path experts can be exponentially larger than the size of the graph. The learning guarantees of the best algorithms just mentioned remain informative in this context, since their dependency on the number of experts is only logarithmic.
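As a concrete illustration of the RWM/Hedge update just described, the following minimal sketch (not part of the project itself) simulates the expert setting with synthetic losses in [0, 1]: it maintains weights proportional to exp(−ηL_i) and tracks the expected regret. The number of experts, the loss distribution, and the learning rate η = sqrt(8 ln K / T) are illustrative assumptions.

```python
import numpy as np

def hedge_weights(cumulative_losses, eta):
    """RWM/Hedge distribution: weight of expert i proportional to
    exp(-eta * L_i), a 'softmin' over the cumulative losses L_i."""
    shifted = cumulative_losses - cumulative_losses.min()  # numerical stability
    w = np.exp(-eta * shifted)
    return w / w.sum()

# Minimal simulation with synthetic per-round losses in [0, 1].
rng = np.random.default_rng(0)
K, T = 5, 1000
eta = np.sqrt(8 * np.log(K) / T)   # learning rate tuned with knowledge of T
L = np.zeros(K)                    # cumulative losses of the experts
algo_loss = 0.0
for t in range(T):
    p = hedge_weights(L, eta)
    losses = rng.random(K)         # losses revealed for this round
    algo_loss += p @ losses        # the algorithm incurs the expected loss
    L += losses
print("regret:", algo_loss - L.min())
```

Replacing the softmin distribution with an argmin over cumulative losses in which each individual past loss has been independently set to zero with probability one half gives the parameter-free dropout variant of FL mentioned above.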
However, the computational complexity of these algorithms also directly depends on the number of experts, which in general makes them impractical in this context. This problem has been extensively studied in the special case of additive losses, where the loss of a path expert is the sum of the losses of its constituent edges. The Expanded Hedge (EH) algorithm of [Takimoto and Warmuth, 2003], an extension of RWM combined with weight pushing [Mohri, 2009], and an adaptation of FPL [Kalai and Vempala, 2005] to this context both provide efficient solutions to this problem, with polynomial-time complexity with respect to the size of the expert graph. For these algorithms, the range ...
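The sketch below, under assumptions of our own (a toy DAG, uniform perturbations scaled by 1/ε, and uniform edge losses), illustrates why the additive case is tractable: FPL can perturb the cumulative edge losses rather than the path losses, so each round reduces to a single shortest-path computation on the expert graph instead of a minimization over the exponentially many path experts.

```python
import numpy as np

# Toy expert DAG: a path expert is any source-to-sink ("s" to "t") path and,
# with an additive loss, its loss is the sum of its edge losses.  The graph,
# the noise distribution, and the loss model below are illustrative only.
edges = [("s", "a"), ("s", "b"), ("a", "c"), ("b", "c"), ("a", "t"), ("c", "t")]
topo_order = ["s", "a", "b", "c", "t"]          # topological order of the DAG

def shortest_path(edge_cost):
    """Minimum-cost s-to-t path by dynamic programming over the DAG."""
    dist = {v: np.inf for v in topo_order}
    back = {}
    dist["s"] = 0.0
    for u in topo_order:
        for x, y in edges:
            if x == u and dist[u] + edge_cost[(x, y)] < dist[y]:
                dist[y] = dist[u] + edge_cost[(x, y)]
                back[y] = (x, y)
    path, v = [], "t"                            # follow back-pointers from the sink
    while v != "s":
        path.append(back[v])
        v = back[v][0]
    return list(reversed(path))

# FPL on edges: perturb the cumulative *edge* losses, then pick the path of
# minimum total perturbed loss via one shortest-path call per round.
rng = np.random.default_rng(0)
T, epsilon = 1000, 0.1                           # epsilon scales the perturbation
cum = {e: 0.0 for e in edges}                    # cumulative edge losses
total_loss = 0.0
for t in range(T):
    perturbed = {e: cum[e] - rng.uniform(0.0, 1.0 / epsilon) for e in edges}
    chosen = shortest_path(perturbed)            # the predicted path expert
    losses = {e: rng.random() for e in edges}    # edge losses revealed this round
    total_loss += sum(losses[e] for e in chosen)
    for e in edges:
        cum[e] += losses[e]
print("cumulative loss of FPL:", total_loss)
```

The Expanded Hedge algorithm instead maintains edge weights and uses weight pushing to represent the induced distribution over paths in time polynomial in the graph size; the non-additive rational and tropical losses studied in the project are precisely the cases where this per-edge decomposition no longer applies directly.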
Similar resources
On-Line Learning Algorithms for Path Experts with Non-Additive Losses
We consider two broad families of non-additive loss functions covering a large number of applications: rational losses and tropical losses. We give new algorithms extending the Follow-the-Perturbed-Leader (FPL) algorithm to both of these families of loss functions and similarly give new algorithms extending the Randomized Weighted Majority (RWM) algorithm to both of these families. We prove that...
Follow the Leader with Dropout Perturbations
We consider online prediction with expert advice. Over the course of many trials, the goal of the learning algorithm is to achieve small additional loss (i.e. regret) compared to the loss of the best from a set of K experts. The two most popular algorithms are Hedge/Weighted Majority and Follow the Perturbed Leader (FPL). The latter algorithm first perturbs the loss of each expert by independen...
Open Problem: Shifting Experts on Easy Data
A number of online algorithms have been developed that have small additional loss (regret) compared to the best “shifting expert”. In this model, there is a set of experts and the comparator is the best partition of the trial sequence into a small number of segments, where the expert of smallest loss is chosen in each segment. The regret is typically defined for worst-case data / loss sequences...
Online Multi-task Learning with Hard Constraints
We discuss multi-task online learning when a decision maker has to deal simultaneously with M tasks. The tasks are related, which is modeled by imposing that the M–tuple of actions taken by the decision maker needs to satisfy certain constraints. We give natural examples of such restrictions and then discuss a general class of tractable constraints, for which we introduce computationally effici...
Designing collaborative learning model in online learning environments
Introduction: Most online learning environments are challenging for the design of collaborative learning activities to achieve high-level learning skills. Therefore, the purpose of this study was to design and validate a model for collaborative learning in online learning environments. Methods: The research method used in this study was a mixed method, including qualitative content analysis and...